ABSTRACT

Two established methods of code improvement, Day [4] and
Kildall [7], are reviewed. The problems of optimal register
allocation are discussed. A method is presented using
Kildall's [7] optimization algorithm for specifying the
active data items in a program. Demonstrations of
particular problems with register allocation are presented.
Topics for further consideration in a complete solution are
discussed.
II.   BACKGROUND

The  advent  of  higher  order  languages  began  the  era  of 
compilers  and  subsequently  optimizers.  The  problem  with 
compilers  in  general  is  that  more  efficient  code  can  often 
be  written  by  an  assembly  level  programmer.  Efficiency  in 
this  case  is  measured  in  the  execution  time  for  the  program, 
the  amount  of  memory  required  to  store  the  program,  or 
both. The desire for improved code led to the development
of  the  science  of  code  optimization.  The  field  of 
optimization  has  been  explored  [1,3,6,7,10]  and  several 
basic  theories  have  evolved. 

One  important  aspect  of  optimization  is  register 
allocation.  The  register  allocation  problem  is  the  problem 
of  assigning  data  items  to  registers  so  that  the  resulting 
code  is  efficient.  The  extension  of  the  problem  is  to 
coordinate register assignments made on different branches
and  to  determine  data  item  replacement.  Techniques  will  be 
presented  for  examining  these  two  problems. 

The  discussions  which  follow  will  relate  to  a  class  of 
computers  which  are  characterized  by  having  a  set  of  general 
purpose  registers.  The  methods  discussed  would  not  apply  to 
stack  machines,  for  example. 

Data  item  allocation  will  be  to  one  of  a  set  of  general 
purpose registers. The methods will then apply to the
allocation  problem  by  data  item  type.  That  is  to  say,  the 
methods  may  be  applied  to  allocating  floating  point  data 
items  to  a  set  of  floating  point  registers  and  integer  data 
items  to  a  set  of  integer  registers  as  long  as  the  data  item 
type  can  be  identified  by  the  compiler. 

Several authors have addressed the problem of optimal
register allocation, either directly or indirectly. It will
be the purpose of this treatise to examine two of these
techniques [Day 4, Kildall 7] with the purpose of determining
the optimizing information which is necessary for optimal
allocation.


A.    OPTIMAL  REGISTER  ALLOCATION,  DAY  [4] 

One  solution  to  optimal  register  allocation  has  been 
presented by Day [4]. Day's method is based primarily on
the  concepts  of  data  item  interference  and  profit. 

Day  defines  a  data  item  as  a  constant  or  a  data  name. 
"A  data  item  is  defined  when  statement  execution  causes  a 
new  value  to  become  associated  with  the  data  item." 
Constants  are  defined  by  their  representation  and  their 
values  are  not  normally  changed.  "A  data  item  is  referred 
to  when  the  current  value  of  the  data  item  is  required  for 
correct  statement  execution."  Having  the  current  value 
required  for  correct  statement  execution  implies  that  the 
value  is  at  least  temporarily  in  a  register.  "A  data  item 
is  active  at  a  point  in"  a  region  "if  it  may  be  referenced 
subsequent to that point." Day defines a region to be a
strongly connected subgraph of the program when represented
as a directed graph. A strongly connected subgraph, by
definition, means that any node in the subgraph can be
reached from any other node in the subgraph. Because of
this characteristic of strong connectivity when combined
with the definition of an active data item, a data item is
necessarily active over an entire region if it is active at
any point in the region. Strongly connected regions
intuitively correspond to nested loop structures in the
source program. Day states that "two data items interfere
in" a region "if they are both active at a point in" the
region. Extension of the concept of active data items
implies that any data items active in a region must
necessarily interfere with any other active data item in the
region at all points in the region. Since a region is
strongly connected, all points in the region are necessarily
subsequent to every other point in the region.


The principal characteristic of interfering data items
is that if they are allocated to the same register, at some
point they will both be active, by the definition of
interference, and may not be allocated to the same register
at that point. Likewise, the characteristic of
non-interfering data items is that at no point are any two
non-interfering data items simultaneously active.
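
The interference relation described above can be illustrated with
a short sketch. The Python below is not taken from Day [4]; the
function name and the representation of a region as a list of
per-point active sets are assumptions made only for illustration.

```python
from itertools import combinations

def interference_pairs(active_at_point):
    """Return the unordered pairs of data items that are both active
    at some common point (hypothetical helper, not Day's code)."""
    pairs = set()
    for active in active_at_point:
        # Every two items active at the same point interfere.
        for a, b in combinations(sorted(active), 2):
            pairs.add((a, b))
    return pairs

# Assumed example: active sets at four consecutive points.
points = [{"X", "Y"}, {"X", "Y"}, {"Y", "Z"}, {"Z"}]
pairs = interference_pairs(points)
```

In this assumed example X and Z are never active at the same
point, so they do not interfere and could share a register.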

The  concept  of  profit  is  a  numerical  "representation  of 
the  improvement  in  program  execution  that  may  occur  if  the 
data  item  is  globally  assigned  to  a  register  being 
processed." The comparative values of the profits are the
deciding  factors  in  making  the  assignment  of  a  data  item  to 
a  register.  "The  values  assigned  to  the  profit  equation 
constants  determine  whether  the  profit  represents  a 
projected  improvement  in  program  size  or  execution  time." 
Day  assumes  "the  profit  of  a  particular  global  assignment  to 
be  the  sum  of  those  data  items  therein  assigned  to 
registers."  In  terms  of  the  analysis,  the  method  is  to 
maximize over all possible assignments to identify the
assignment  with  the  largest  profit  value. 

A  basic  block  is  an  ordered  set  of  statements 

{S1, S2, S3, ..., Sk}

which is entered only through S1 and branched from only at
Sk and where Si is executed before Si+1. Day uses this
definition in the discussion of both local and global
assignment. "Local and global assignment differ in the
extent of the program over which the assignment of data
items to registers is effective: local assignment occurs
within a basic block, while global assignment occurs within
a region."

The  desirability  of  global  assignment  stems  from  the 
weakness  of  the  more  easily  conducted  local  assignment.  Day 
states  that  one  "weakness  in  local  assignment  involves  the 
disposition  of  data  items  that  are  defined  or  referred  to  in 
a block and are active on entry to or exit from the block.
Local assignment cannot usually retain assignment history
across  block  boundaries,  and  so  the  values  of  active  data 
items  must  be  moved  to  main  storage  for  interblock  transfers 
of  control." 

Day introduces three types of allocation: one-one,
many-one, and many-few. "A one-one assignment defines a
one-to-one  correspondence  between"  the  data  items  and  the 
registers.  A  weakness  in  global  one-one  assignment  is  that 
it  is  usually  incapable  of  assigning  more  than  one  data  item 
to  a  register  in  a  region.  Day's  approach  to  the  solution 
of  the  problem  "is  to  consider  a  set  of  data  items  for 
assignment  to  a  register  if  no  two  data  items  in  the  set 
interfere  at  any  point  in  the  region."  Day's  global 
many-few  assignment  method  has  this  characteristic. 
Many-few  assignment  is  a  single  valued  mapping  of  a  subset 
of  the  data  items  in  a  program  onto  a  set  of  registers  where 
the  number  of  data  items  competing  for  assignment  is  greater 
than  the  number  of  available  registers. 

Day  presents  a  solution  method  for  the  global  many-few 
problem  utilizing  matrix  construction  and  multiplication  to 
implement  the  interference  characteristics  of  the  data 
items. 
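
Day's matrix formulation is not reproduced here. As a rough
illustration of the many-few idea only, the sketch below places
items in registers greedily, most profitable first, never putting
two interfering items in the same register. The greedy strategy,
the function name, and the sample profits are all assumptions; a
greedy pass is not guaranteed to maximize total profit.

```python
def greedy_many_few(profits, interferes, num_registers):
    """Greedy stand-in for many-few assignment (NOT Day's method).

    profits: dict mapping data item -> profit
    interferes: set of frozenset pairs of interfering items
    Returns one set of data items per register.
    """
    registers = [set() for _ in range(num_registers)]
    # Consider the most profitable items first (ties broken by name).
    for item in sorted(profits, key=lambda d: (-profits[d], d)):
        for reg in registers:
            if all(frozenset((item, other)) not in interferes
                   for other in reg):
                reg.add(item)
                break  # placed; items that fit nowhere stay in memory
    return registers

profits = {"X": 5, "Y": 4, "Z": 3, "W": 1}
interferes = {frozenset(p) for p in
              [("X", "Y"), ("Y", "Z"), ("X", "W"), ("Y", "W")]}
assignment = greedy_many_few(profits, interferes, 2)
```

Here X and Z share the first register, Y takes the second, and W
is left unassigned because it interferes with an item in each
register.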

B.    GLOBAL  EXPRESSION  OPTIMIZATION,  KILDALL  [7]

Kildall  conducts  an  analysis  of  program  structure  in 
order  to  produce  optimized  object  code.  Kildall  utilizes  a 
directed  graph  to  represent  the  program  flow,  along  with  an 
"optimizing  pool,"  an  "optimizing  function,"  and  a  "meet 
operation"  to  conduct  his  analyses. 

An  optimizing  pool  is  associated  with  each  node  in  the 
graph. The nature of the pool is an arbitrary set which
describes  the  optimizing  information  associated  with  a 
particular  node  in  terms  of  the  analysis  being  conducted. 

An  optimizing  function  maps  an  "input  pool"  and  the 
optimizing pool of a node to a new "output" pool. In every
instance, the input pool for a node is derived from the
output pools for the node's immediate predecessors. The
output pool of a node contributes to the input pool of the
node's immediate successor(s).

The  meet  operation  is  defined  to  handle  the  problem  of 
combining  two  or  more  input  pools  at  a  point  where  two  or 
more  program  flows  join,  and  varies  for  differing  types  of 
analysis.  The  meet  operations  defined  are  binary, 
associative, and commutative. The meet operation maps each
pair of optimizing pools to an optimizing pool.
This can be represented as:

P × P → P
(where P is the set of all optimizing pools).

Kildall  defines  several  types  of  analysis.  Two  of  his 
analyses  are  of  primary  interest  and  will  be  reviewed 
below.  The  two  are  common  subexpression  elimination  and 
live  variable  analysis. 

For  common  subexpression  analysis,  the  pool  of  computed 
expressions  is  partitioned  into  equivalence  classes  whose 
members  are  known  to  have  identical  values.  The  optimizing 
function  for  common  subexpression  analysis  manipulates  the 
equivalence  classes  of  the  partition.  "Two  expressions  are 
placed  into  the  same  class  of  the  partition  if  they  are 
known  to  have  equivalent  values."  The  meet  operation  for 
common  subexpression  analysis  is  intersection  by  equivalence 
classes. 

For live variable analysis a reversed program flow graph
is used for the analysis. At any point, the pool associated
with a node is the set of data items which may be referenced
subsequent (in the forward direction) to the node. The
optimizing function for live variable analysis has two
distinct characteristics. These are:

"1. If the expression at node N involves an assignment
to a variable, let d be the destination of the
assignment; set P ← P − {e | d is a subexpression in
e} (d and all expressions containing d become dead
expressions)" (e is the set of all partial
computations at the current node.)
"2. Consider each partial computation e at node N. Set
P ← P ∪ {e}. The value of the optimizing function is
altered to the value of P."

The meet operation for live variable analysis is set
union.
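
The two rules of the live variable optimizing function can be
sketched as follows. This is a minimal illustration, not Kildall's
implementation: expressions are represented as plain strings, and
the helper names are assumptions.

```python
import re

def variables_in(expr):
    """Names appearing in an expression string such as 'A+B'."""
    return set(re.findall(r"[A-Za-z_]\w*", expr))

def live_function(dest, computations, pool):
    """Apply rules 1 and 2 to an input pool.

    dest: destination of the assignment at the node, or None
    computations: partial computations appearing at the node
    """
    result = set(pool)
    if dest is not None:
        # Rule 1: d and every expression containing d become dead.
        result = {e for e in result if dest not in variables_in(e)}
    # Rule 2: every partial computation at the node becomes live.
    result |= set(computations)
    return result

# Node "X = A+B" on the reversed flow graph, with X, X*C and C
# live after the node (assumed example).
out = live_function("X", ["A", "B", "A+B"], {"X", "X*C", "C"})
```

In the assumed example X and X*C are removed by rule 1, and A, B,
and A+B are added by rule 2, while C is carried through unchanged.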

For completeness, Kildall's flow analysis algorithm is
presented below. The following notation will be used in
the presentation: P is the set of all possible optimizing
pools. E is an entry pool set. 1 is the unit element
for the analysis being conducted. I(N) denotes the
immediate successors of node N.

A1 [initialize]    L ← E

A2 [terminate?]    If L = ∅ then HALT

A3 [select node]   Let L' ∈ L, L' = (N, Pi) for some node N
                   and Pi ∈ P; L ← L − {L'}

A4 [traverse?]     Let Pn be the current approximate pool
                   of optimizing information associated
                   with the node N (initially Pn = 1). If
                   Pn ≤ Pi Go To step A2.

A5 [set pool]      Pn ← Pn ∧ Pi;
                   L ← L ∪ {(N', f(N, Pn)) | N' ∈ I(N)}

A6 [loop]          Go To step A2.
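
A worklist rendering of steps A1 through A6 is sketched below in
Python. Several points are assumptions rather than part of the
original presentation: the unit element is modeled by a sentinel
value distinct from every pool, the test in step A4 is carried out
by checking whether the meet of the two pools equals the current
pool, and the small flow graph with an intersection meet is an
invented example.

```python
UNIT = None  # sentinel standing in for the unit element (assumption)

def meet_with_unit(meet, p, q):
    """Meet extended so that UNIT behaves as the identity."""
    if p is UNIT:
        return q
    if q is UNIT:
        return p
    return meet(p, q)

def flow_analysis(successors, entry, f, meet):
    """successors: dict node -> list of immediate successors
    entry: list of (node, pool) pairs, the entry pool set
    f: optimizing function f(node, pool) -> pool
    meet: binary meet operation on pools
    """
    pools = {}                        # current pool per node
    worklist = list(entry)            # A1
    while worklist:                   # A2
        node, p_in = worklist.pop()   # A3
        p_cur = pools.get(node, UNIT)
        # A4: skip when the current pool already lies below p_in.
        if p_cur is not UNIT and meet_with_unit(meet, p_cur, p_in) == p_cur:
            continue
        # A5: update the pool and propagate to the successors.
        pools[node] = meet_with_unit(meet, p_cur, p_in)
        out = f(node, pools[node])
        worklist.extend((s, out) for s in successors.get(node, []))
    return pools                      # A6 is the loop itself

# Invented example: items generated at each node, meet = intersection.
succ = {1: [2, 3], 2: [4], 3: [4], 4: []}
gen = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: set()}
pools = flow_analysis(
    succ,
    [(1, frozenset())],
    lambda n, p: p | frozenset(gen[n]),
    lambda p, q: p & q,
)
```

The pool recorded for each node is its input pool. At node 4 the
two branches contribute {a,b} and {a,c}, and the intersection meet
leaves {a}.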


Examples  of  optimizing  pools,  an  optimizing  function, 
and  a  meet  operation  are  presented  in  section  V.  The  term 
global  will  be  used  henceforth  to  refer  to  an  entire  program 
and  not  just  a  region. 

By utilizing Kildall's methods, the data item concept
can be expanded to include expressions. This extension is
desirable because a repeated expression would have to be
recomputed if allocation were only to variables and
constants. Kildall [7] presents a data structure which may
be used for manipulating the data items under the expanded
definition.
III.   THE  CONCEPT  OF  PROFIT

The  concept  of  profit  is  essential  to  register 
allocation.  Day's  definition  of  profit  is  a  linear 
combination of the number of definitions of and references
to  a  data  item  in  a  region. 

The  expanded  definition  of  data  item  may  mean  that  the 
number  of  references  may  not  accurately  and  completely 
reflect  the  value  of  a  data  item  (an  expression,  for 
example)  in  a  register.  It  will  not  be  the  purpose  of  this 
paper  to  specify  an  explicit  profit  function.  However,  the 
contributing  factors  of  the  profit  function  under  the 
expanded  data  item  definition  will  be  discussed  below. 

In general, it is assumed that the profit should reflect
a measurable quantity. If the optimization is to be toward
program size, the profit function should be a measure of the
relative number of instructions required to execute the
resulting code. If the optimization is to be toward program
run time, the profit function should be a measure of the
execution time of the resulting instructions. The two
concepts, of course, are often closely related.

The  profit  should  increase  with  the  number  of  references 
to  a  data  item  over  an  active  region.  If  a  high  profit 
results  from  this  factor,  it  would  imply  a  decrease  in  the 
number  of  load  operations  (and  combining  operations  in  the 
case of expressions). This factor would then imply a
decrease  in  program  size  and,  depending  on  the  operation 
times,  often  leads  to  savings  in  run  time. 

The profit should increase with the number of
instructions necessary to replace the data item in a
register, due to the fact that the data item concept is
extended to include arbitrary expressions. This factor is
called "complexity," since the profit is related to the
complexity of the data item. To reduce run time, the profit
function should weight operations according to their
execution times. A slower operation would lead to a higher
profit, since it would require more time to replace the
data item in a register. Exponentiation would be weighted
more heavily than addition, for example.

The  profit  should  decrease  with  increased  distance  to 
the  next  reference,  thus  preventing  highly  complex  data 
items  from  holding  a  register  over  long  program  flows 
without  reference. 

The  profit  should  be  adjusted  with  program  flow 
information,  when  available.  Logically,  a  data  item  on  a 
highly  executed  branch  would  have  higher  value  in  a  register 
than  a  similar  data  item  on  a  seldom  executed  branch. 
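
No explicit profit function is specified in this paper. Purely to
make the four factors above concrete, the sketch below combines
them in one of many possible ways. The formula, the field names,
and the sample values are all assumptions and carry no authority
beyond showing the intended direction of each factor.

```python
from dataclasses import dataclass

@dataclass
class ItemStats:
    references: int         # references over the active region
    reload_cost: float      # "complexity": cost to replace in a register
    distance: int           # instructions until the next reference
    frequency: float = 1.0  # relative execution frequency of the branch

def profit(stats):
    """Illustrative form only: rises with references, complexity and
    branch frequency, falls with distance to the next reference."""
    return (stats.frequency * stats.references * stats.reload_cost
            / (1 + stats.distance))

base = profit(ItemStats(references=2, reload_cost=1.0, distance=0))
more_refs = profit(ItemStats(references=4, reload_cost=1.0, distance=0))
far_away = profit(ItemStats(references=2, reload_cost=1.0, distance=3))
```

With these assumed values, doubling the references doubles the
profit, while a distance of three instructions to the next
reference reduces it to one quarter.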
IV.   ACTIVE  DATA  ITEM  EXTENSION

One  of  the  primary  attributes  of  Day's  analysis  deals 
with  the  concept  of  interfering  data  items.  A  problem  with 
Day's  definition  of  an  active  data  item  when  combined  with 
his  definition  of  a  region  was  mentioned  above.  It  becomes 
desirable  to  make  a  new  definition  of  an  active  data  item  to 
correct  that  problem.  Intuitively,  a  data  item  is  active 
between  a  definition  of  the  data  item  and  the  last  reference 
to  the  value  of  the  data  item  which  was  thereby  defined.  In 
terms  of  register  allocation,  this  definition  may  be  stated 
as:  a  data  item  is  active  at  all  points  where  the  value 
of  the  data  item  must  exist  or  have  existed  in  a  register 
for  proper  statement  execution  and  remains  active  to  the 
last  reference  to  the  data  item  for  which  the  value  which 
existed  in  a  register  would  yield  correct  execution. 
Informally,  in  terms  of  registers,  a  variable  becomes  active 
when  the  associated  value  exists  in  a  register  and  remains 
active  over  the  range  to  the  last  point  at  which  it  is 
referenced  prior  to  redefinition  or  program  termination.  In 
other  words,  it  is  the  range  over  which  a  data  item 
maintains  the  value  which  was  at  one  time  loaded  into  a 
register. 

To  an  extent,  live  variable  analysis  corresponds  to  this 
definition.  Variables  are  included  as  being  live  over  the 
range  from  which  they  are  assigned  a  value  by  an  executable 
statement  to  the  point  of  their  last  reference  prior  to 
redefinition  (by  an  executable  statement)  or  program 
termination.  Live  variable  analysis  departs  from  the 
definition  given  for  active  data  items  in  two  ways. 

The first departure of live variables from active
variables comes from the case of data items which are
implicitly defined. Implicit definition may be made by the
representation (in the case of constants), by compile-time
assignments (e.g., the FORTRAN "DATA" statement), or by
default memory initialization. Implicitly defined data
items are evaluated as live from program entry to the last
reference to the data item (with possible non-active
sections interspersed). For implicitly defined data items,
however, the value associated with the data item does not
exist in a register until the first reference. Implicitly
defined data items are, therefore, active from their first
reference to their last reference or redefinition.

The  second  departure  of  live  variables  from  active 
variables  may  occur  from  a  READ-type  statement.  Depending 
on  the  machine  configuration  and  the  data  manipulation  for  a 
READ-type  statement,  the  data  item  read  may  or  may  not  have 
existed  in  a  usable  form  in  a  register.  Live  variables 
begin a live segment with the definition by a READ-type
statement. Depending on the data manipulation, a READ
statement  may  or  may  not  originate  an  active  program  segment 
for  that  data  item. 
V.   REFERENCED  DATA  ITEM  ANALYSIS

In  order  to  analyze  data  item  interference  based  on  the 
revised  definition  of  active  data  items,  referenced  data 
item  analysis  is  introduced.  The  purpose  of  referenced 
data  item  analysis  is  to  provide  a  method  which  determines 
data  items  possessing  the  characteristics  of  active  data 
items not possessed by live variables so that the active
data  items  may  be  determined.  In  particular,  referenced 
data  item  analysis  produces  sets  of  data  items  which  have 
previously  appeared  in  a  register. 

The  optimization  pool  for  referenced  data  item  analysis 
is  the  set  of  all  data  items  which  have  been  referenced 
previous  to  the  current  point  in  the  program  flow.  The 
optimizing  function  performs  a  union  of  all  data  items  in 
the  expression  at  the  current  node  with  the  input  set  of 
referenced  data  items.  Thus  for  an  expression  R=A+E  at  a 
node  N  with  an  input  pool  of  {X,Y,Z} 

F(N, {X,Y,Z}) = {X,Y,Z} U {R,A,E} = {R,A,E,X,Y,Z}
where F(N,Pn) is the optimizing function operating on node N
and the corresponding input pool Pn.

Active   data   items   are   not  dependent  upon  the  program 
branch  structure.    An  active  data  item   is   active   from   a 
first   reference   (loaded   into   a   register)   to   the  final 
reference,  with  possible   inactive   segments   interspersed. 
The  meet  operation  is,  therefore,  set  union. 
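
The optimizing function and meet operation for referenced data
item analysis are simple enough to state directly in code. The
function names below are assumptions; the first example is the
one given above for the expression R=A+E.

```python
def referenced_function(items_at_node, pool):
    """Union of the data items in the expression at the node with
    the input pool of referenced data items."""
    return frozenset(pool) | frozenset(items_at_node)

def referenced_meet(p, q):
    """Meet operation for referenced data item analysis: set union."""
    return frozenset(p) | frozenset(q)

# The expression R=A+E at a node with input pool {X,Y,Z}.
out = referenced_function({"R", "A", "E"}, {"X", "Y", "Z"})

# Two flows joining: anything referenced on either branch has
# been referenced at the join (assumed example).
joined = referenced_meet({"A", "X"}, {"B", "X"})
```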

As discussed above, the inclusion in the referenced data
pools of a variable which participates in a READ-type
statement will be dependent upon the machine for which the
output code is intended.

As  discussed  in  section  IV,  live  variables  have  the 
characteristics  of  active  variables  with  the  exception  that 
their point of entry into the set of active data items may
overextend the definition point. Referenced data items have
the  characteristics  of  active  data  items  except  that  the 
data  items  may  (and  in  general  will)  extend  past  the  last 
reference.  Intersection  of  pools  for  these  two  analyses  at 
every  point  in  the  program  flow,  then,  will  produce  sets  of 
data  items  which  are  active  at  that  point.  It  should  be 
noted  that  a  single  forward  pass  is  insufficient  for  active 
variable  analysis  because  on  a  forward  pass  the  current 
reference  to  a  variable  is  not  known  to  be  the  last. 
Similarly,  a  single  reverse  pass  is  insufficient  because  the 
current  reference  is  not  known  to  be  the  first  reference. 
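
The intersection step can be sketched as follows, assuming the
live pools (from the reverse pass) and the referenced pools (from
the forward pass) have already been computed for each node. The
pools shown are hand-written for a four-node straight-line
program in which the constant C is implicitly defined and first
referenced at node 3; they are an invented example.

```python
def active_pools(live, referenced):
    """Active data items at each node: those that are both live
    and previously referenced at that node."""
    return {n: set(live[n]) & set(referenced[n]) for n in live}

# C is live from program entry but is not referenced (and so is
# not in a register) until node 3. X is referenced from node 1
# and is last used at node 2.
live = {1: {"C", "X"}, 2: {"C", "X"}, 3: {"C"}, 4: set()}
referenced = {1: {"X"}, 2: {"X"}, 3: {"X", "C"}, 4: {"X", "C"}}
active = active_pools(live, referenced)
```

Neither analysis alone gives the active range of C: the live
pools start too early, and the referenced pools never end.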
VI.   A  PROBLEM  WITH  ALLOCATION

The  problem  of  register  allocation  is  now  considered. 
There  are  certain  necessary  requirements  which  are  implicit 
because  of  the  interference  characteristics  of  the 
variables.  The  nature  of  these  necessary  requirements  is 
that  no  two  mutually  active  variables  may  be  assigned  to  the 
same  register.  There  are  some  desired  characteristics 
which  are  derived  from  the  branching  structure  of  the 
program.  The  nature  of  the  desired  characteristics  is  that 
variables  active  on  several  branches  be  in  the  same  register 
at  the  point  where  the  branches  join.  The  necessary  and 
the  desired  characteristics  will  now  be  discussed. 

At  any  point  in  the  program,  only  one  data  item  may  be 
allocated  to  a  register.  Day  [4]  allocates  a  set  of 
non-interfering  data  items  to  a  register.  The 
characteristics  of  non-interfering  data  items,  however, 
imply  that  only  one  data  item  at  any  given  point  will  have 
value in the register (be active at that point). Therefore,
although  Day  allocates  a  set  of  data  items,  at  any  point 
only  one  member  of  that  set  will  be  in  a  register. 

Further,  all  active  data  items  have  value  in  a 
register.  If  the  number  of  active  data  items  is  greater 
than  the  number  of  registers,  then  a  selection  must  be  made 
of  the  data  items  to  be  allocated.  Profit  is  the 
measurement  used  to  make  the  selection.  The  selection  may 
be  made  by  reducing  the  set  of  active  data  items  at  each 
node  to  the  M  most  profitable  data  items,  where  M  is  the 
number  of  registers.  The  necessary  requirements  will  be 
derived  from  these  reduced  active  data  item  pools.  That  is, 
no  two  members  of  any  reduced  active  pool  at  any  point  may 
be  allocated  to  the  same  register  at  that  point. 
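
The reduction of an active pool to its M most profitable members
can be sketched as below. The profit values are assumed to be
supplied by some profit function; the tie-breaking rule (by name)
and the sample numbers are assumptions.

```python
def reduce_pool(active, profits, m):
    """Keep the m most profitable data items of an active pool."""
    ranked = sorted(active, key=lambda d: (-profits[d], d))
    return set(ranked[:m])

# Three active items competing for two registers (assumed values).
profits = {"X": 5, "Y": 2, "Z": 7}
reduced = reduce_pool({"X", "Y", "Z"}, profits, 2)
```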

It is desirable that data items which are allocated on
different branches, and are active at a point where the
branches join, be allocated to the same register on each
branch. By meeting this desired characteristic, the register
will hold the correct value of the data item regardless of
the branch taken to reach the point where they join. The
desired characteristics apply only in the event the data
item is active on all branches which join at a point. If the
data item were not active on every branch just prior to the
join point, the contents of a register would be dependent
upon the branch taken at run time.

Figure  I  is  now  presented  to  illustrate  the  desired  and 
the  necessary  characteristics  of  a  program  segment.  There 
are  three  variables  in  the  example,  X,Y  and  U.  For  purposes 
of the example, Ra(d) and Rb(d) will represent the register
allocated to the data item represented by d on branch a and
b  respectively.  R1  and  R2  will  represent  the  actual 
registers  in  the  two  register  machine. 


FIGURE I. Program segment with two branches, branch a and
branch b (figure not reproduced).
X and U are active prior to point 1 and over all of
branch a. X is active on branch b to point 2 and from
point 3 to point 4. Y is active on branch b from point 1 to
point 3. U is active on branch b from point 2 to point 4.

The necessary requirements in the example are shown
below.

1  Ra(X)≠Ra(U)

2  Rb(X)≠Rb(Y)

3  Rb(U)≠Rb(Y)

4  Rb(U)≠Rb(X)

The  desired  characteristics  in  the  example  are: 

1  Ra(X)=Rb(X) 

2  Ra(U)=Rb(U) 

Another set of constraints stems from the activity
characteristics of the variables in conjunction with the
necessary requirements. At point 1, for example,
Rb(X)≠Rb(Y) and at point 2 Rb(U)≠Rb(Y). Since there are
only two registers in the machine, these requirements
combine to imply that Rb(X)=Rb(U).

More specifically, the register released by a data item
when it becomes inactive is subsequently used by another
data item when it becomes active. Thus the register used by
the first item is "equated to" the register of the second
data item. Due to the fact that the activity of the data
items must not conflict, this action is termed "complement
equation."

The  complement  equation  characteristics  of  the  example 
are: 

1  Rb(U)=Rb(Y)

2  Rb(X)=Rb(U) 

3  Rb(Y)  =Rb(X) 

The  complement  equation  characteristics,  the  necessary 
requirements,  and  the  desired  characteristics  combine  to 
reach a contradiction. The contradiction may be represented
by:
Ra(X)=R1    Arbitrary

Ra(U)=R2    Necessary Requirement 1

Rb(Y)=R2    Complement Equation 1

Rb(X)=R1    Necessary Requirement 2

Rb(U)=R1    Complement Equation 2

Rb(X)=R2    Necessary Requirement 4

Rb(X)=R1    Desired Characteristic 1

Note that the last two assignments are in conflict.
Arbitrarily setting Ra(X)=R2 would lead to the same
contradiction.
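
The contradiction can be checked by exhaustive search over the
two registers. The sketch below is one interpretation of the
example, not part of the original analysis: because Rb(X) refers
to two separate active segments of X, each segment is given its
own name, and a continuity constraint (X keeps its register
through point 1) is added as an assumption.

```python
from itertools import product

REGISTERS = ("R1", "R2")
# Segment-level names (an interpretation, not from the source):
#   X0, U0: X and U before point 1 and on branch a
#   X1: X on branch b up to point 2
#   Y:  Y on branch b from point 1 to point 3
#   U1: U on branch b from point 2 to point 4
#   X2: X on branch b from point 3 to point 4
NAMES = ("X0", "U0", "X1", "Y", "U1", "X2")

def satisfies(r, with_desired):
    necessary = (r["X0"] != r["U0"] and r["X1"] != r["Y"]
                 and r["U1"] != r["Y"] and r["U1"] != r["X2"])
    continuity = r["X1"] == r["X0"]       # assumed: X stays put at point 1
    complement = (r["Y"] == r["U0"]       # Y takes U's register at point 1
                  and r["U1"] == r["X1"]  # U takes X's register at point 2
                  and r["X2"] == r["Y"])  # X takes Y's register at point 3
    desired = r["X2"] == r["X0"] and r["U1"] == r["U0"]
    return (necessary and continuity and complement
            and (desired or not with_desired))

def solutions(with_desired):
    """All register assignments meeting the chosen constraints."""
    found = []
    for regs in product(REGISTERS, repeat=len(NAMES)):
        r = dict(zip(NAMES, regs))
        if satisfies(r, with_desired):
            found.append(r)
    return found

without_desired = solutions(False)
with_all = solutions(True)
```

Under this interpretation the necessary requirements and the
complement equations admit two assignments (one for each choice
of register for X before point 1), and neither survives once the
desired characteristics are added.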

The effect of this contradiction is that, if an
alteration is not made to at least one of the
characteristics, a reference to X or U after point 4 would
require reloading of the variable being referenced.
locally using globally derived information, using Kildall's
techniques [7].

B.  COMPLEMENT  EQUATION  ANALYSIS 

The concept of complement equation was introduced in
section VI. The projected purpose of complement equation
analysis is to specify the complement equation constraints
of a program. Complement equation, being based on the
activity characteristics of the data items, should operate
on the sets of active variables. When the pools of active
data items are compared from node to node, the changes in
the pools identify the data items whose registers are
complement equated.
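
A node-to-node comparison of active pools might be sketched as
follows. The function name is an assumption, and the example
pools are those on branch b of Figure I around point 2.

```python
def pool_changes(prev_active, next_active):
    """Data items that release a register and data items that
    need one when moving from one node to the next."""
    released = set(prev_active) - set(next_active)
    acquired = set(next_active) - set(prev_active)
    return released, acquired

# Branch b at point 2: X becomes inactive and U becomes active,
# so the register of X is complement equated to that of U.
released, acquired = pool_changes({"X", "Y"}, {"U", "Y"})
```

When exactly one item is released and one acquired, the
complement equation is determined; when several items change at
once, there is a choice of pairing.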

C.  DESIRABLY  EQUAL  REGISTERS 

Desirably equal register assignments occur where two or
more program flows join. The program characteristic which
leads to desired register equation is as follows: registers
of data items are desirably equal if they are active on two
or more branches prior to a join point, and are active at
the join point. Specification of this situation may be made
by intersecting the active pools prior to the join point
with the active pool at the join point. Data items which
are in the intersection are then desirably in the same
register on all branches.
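
The intersection described above can be written directly. The
function name and the sample pools are assumptions made for
illustration.

```python
def desirably_equal(pre_join_pools, join_pool):
    """Data items active on every joining branch and at the join
    point; these should occupy the same register on all branches."""
    result = set(join_pool)
    for pool in pre_join_pools:
        result &= set(pool)
    return result

# Two branches joining (assumed pools): only X is active on both
# branches and at the join point.
same_register = desirably_equal([{"X", "U"}, {"X", "Y"}], {"X", "U"})
```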

D.  REGISTER  PRELOADING 

Preloading  is  the  process  of  loading  a  register  prior  to 
a  join  point.  Preloading  may  be  desirable  in  two 
situations. 

The first situation arises from the expanded definition
of data items to include expressions. The situations may be
detected by comparing the complexity of the common
subexpression pools from pre-join nodes to the join node.
If the complexity of the pool increases, the preloading may
be profitable. For example, if the pool structure is

{X,A+B}      {Y,A+B}

      {A+B}

it may be worthwhile forcing A+B to a register at the join
point. To load A+B at the join point would require

LOD    A
ADD    B

However, prior to the join point, A+B may be loaded into a
register by loading either X or Y, requiring only one
operation.

The second situation for which preloading may be of
value occurs when several branches join and a data item is
active on several, but not all of the branches joining at
that point. By preloading the data item on the branches on
which it is not active, the code on the branches may be
fully utilized. Thus the code


LOD  A 
ADD  B 
STO   X 


LOD  A 
ADD  B 
STO   X 
could be modified to

LOD  A
ADD  B
STO  X


LOD  A
ADD  B
STO  Y


LOD  A      LOD  A
ADD  B      ADD  B

      STO  X
in which case the correct value of A+B would be in a
register and could be stored at the join point. The
resulting code in this case would have the same program
size, but would execute in a shorter time, especially if the
right branch were seldom executed.


E.  ALTERATION  OF  THE  COMPLEMENT  EQUATION  CHARACTERISTICS


The complement equation characteristics are the
constraints which must be relaxed. That is, given that a
contradiction exists, the necessary characteristics cannot
be changed, and while the desirably equal registers
constraints may be altered, this could lead to excessive
load-store operations. The opportunities for altering
these characteristics exist in three forms. If there is a
node-to-node change in the active pools of two or more data
items, the newly active data items may be loaded into any of
the vacated registers. If there are fewer than M active
data items at a node (where M is the number of registers),
then the complement equation characteristics may be altered
by performing a register-to-register move. The third
alteration may be made at any point by storing the current
register contents to a temporary location, performing a
register-to-register movement, and loading the vacated
register from the temporary location.

The purpose of altering the complement equation
characteristics  is  to  satisfy  the  desirably  equal  register 
constraints.  When  the  alterations  are  made  at  a  cost,  such 
as  storing,  performing  a  register-to-register  move,  and 
loading,  the  cost  would  have  to  be  balanced  against  the 
advantages  gained  by  satisfying  the  desirably  equal  register 
characteristics. 
X.   CONCLUSIONS

Using  Kildall's  [7]  algorithm,  a  method  was  presented 
for  specifying  the  active  data  items.  The  concept  of 
reducing  the  active  data  items  at  each  node  to  the  M  most 
profitable  (where  M  is  the  number  of  registers)  was 
introduced.  A  description  of  the  nature  of  the  desired  and 
necessary  characteristics  of  allocation  was  presented. 

The  existence  of  the  contradictions  specified  in 
section  VI  implies  that  there  may  not  be  a  satisfactory 
solution  to  every  register  allocation  problem.  If  there  is 
no  universally  satisfactory  solution,  the  problem  then 
becomes  a  linear  programming  problem.  As  in  Day's  solution, 
the  problem  may  be  informally  stated  as: 

MAXIMIZE:    Profit
SUBJECT TO:  1. Necessary Requirements

             2. (Desired Characteristics)'

where (Desired Characteristics)' may be a proper subset of
the desired characteristics of the program. Profit in this
case is the sum of the profits of the data items assigned to
registers and the equating profits less the equating costs.

Maximizing the profit of the variables at each node will
ensure that the profit associated with the data items is
maximum. The natural extension would imply that maximizing
the profit of equating at each step would also result in a
maximum profit globally. This may not be true, however,
since changes to the complement equation characteristics,
when made, are effective over all nodes subsequent to the
node at which the alteration is made. Thus an alteration to
gain a desired register equating may require other
alterations in the complement equation characteristics which
will have an associated cost. None of these statements
have been formally specified, however, and remain as topics
for further investigation. In the final analysis, it
appears that the techniques discussed here must be applied
somewhat heuristically in an attempt to obtain a "good"
allocation. This allocation may be incrementally improved
but, considering the current state of the theory, no
absolute statements are possible at this time.